Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adjust CombinedDeletionPolicy for multiple commits #27456

Merged
merged 1 commit into from
Nov 23, 2017

Conversation

dnhatn
Copy link
Member

@dnhatn dnhatn commented Nov 20, 2017

Today, we keep only the last index commit and use only it to calculate the minimum required translog generation. This may no longer be correct as we introduced a new deletion policy which keeps multiple index commits. This change adjusts the CombinedDeletionPolicy so that it can work correctly with a new index deletion policy.

Relates to #10708, #27367

Today, we keep only the last index commit and use only it to calculate
the minimum required translog generation. This may no longer be correct
as we introduced a new deletion policy which keeps multiple index
commits. This changes adjust the `CombinedDeletionPolicy` so that it can
work correctly with a new index deletion policy.

Relates to elastic#10708, elastic#27367
Copy link
Contributor

@bleskes bleskes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

// However, there are commits that are not deleted just because they are being snapshotted (rather than being kept by the policy).
// TODO: We need to distinguish those commits and skip them in calculating the minimum required translog generation.
long minRequiredGen = Long.MAX_VALUE;
for (IndexCommit indexCommit : commits) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can do some streaming java8 magic here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, but indexCommit.getUserData() throws IOException.

@dnhatn
Copy link
Member Author

dnhatn commented Nov 23, 2017

Thanks @bleskes.

@dnhatn dnhatn merged commit e95d18e into elastic:master Nov 23, 2017
dnhatn added a commit that referenced this pull request Nov 23, 2017
Today, we keep only the last index commit and use only it to calculate
the minimum required translog generation. This may no longer be correct
as we introduced a new deletion policy which keeps multiple index
commits. This change adjusts the CombinedDeletionPolicy so that it can
work correctly with a new index deletion policy.

Relates to #10708, #27367
@dnhatn dnhatn deleted the translog-deletionpolicy branch November 23, 2017 16:52
martijnvg added a commit that referenced this pull request Nov 24, 2017
* es/master: (38 commits)
  Backport wait_for_initialiazing_shards to cluster health API
  Carry over version map size to prevent excessive resizing (#27516)
  Fix scroll query with a sort that is a prefix of the index sort (#27498)
  Delete shard store files before restoring a snapshot (#27476)
  Replace `delimited_payload_filter` by `delimited_payload` (#26625)
  CURRENT should not be a -SNAPSHOT version if build.snapshot is false (#27512)
  Fix merging of _meta field (#27352)
  Remove unused method (#27508)
  unmuted test, this has been fixed by #27397
  Consolidate version numbering semantics (#27397)
  Add wait_for_no_initializing_shards to cluster health API (#27489)
  [TEST] use routing partition size based on the max routing shards of the second split
  Adjust CombinedDeletionPolicy for multiple commits (#27456)
  Update composite-aggregation.asciidoc
  Deprecate `levenstein` in favor of `levenshtein` (#27409)
  Automatically prepare indices for splitting (#27451)
  Validate `op_type` for `_create` (#27483)
  Minor ShapeBuilder cleanup
  muted test
  Decouple nio constructs from the tcp transport (#27484)
  ...
martijnvg added a commit that referenced this pull request Nov 24, 2017
* es/6.x: (30 commits)
  Add wait_for_no_initializing_shards to cluster health API (#27489)
  Carry over version map size to prevent excessive resizing (#27516)
  Fix scroll query with a sort that is a prefix of the index sort (#27498)
  Delete shard store files before restoring a snapshot (#27476)
  CURRENT should not be a -SNAPSHOT version if build.snapshot is false (#27512)
  Fix merging of _meta field (#27352)
  test: do not run percolator query builder bwc test against 5.x versions
  Remove unused method (#27508)
  Consolidate version numbering semantics (#27397)
  Adjust CombinedDeletionPolicy for multiple commits (#27456)
  Minor ShapeBuilder cleanup
  [GEO] Deprecate ShapeBuilders and decouple geojson parse logic
  Improve docs for split API in 6.1/6.x (#27504)
  test: use correct pre 6.0.0-alpha1 format
  Update composite-aggregation.asciidoc
  Deprecate `levenstein` in favor of `levenshtein` (#27409)
  Decouple nio constructs from the tcp transport (#27484)
  Bump version from 6.1 to 6.2
  Fix whitespace in Security.java
  Tighten which classes can exit
  ...
dnhatn added a commit that referenced this pull request Nov 24, 2017
The commit looks harmless, unfortunately it can break the engine flush
scheduler and the translog rolling. Both `uncommittedOperations` and
`uncommittedSizeInBytes` are currently calculated based on the minimum
required generation for recovery rather than the translog generation of
the last index commit. This is not correct if other index commits are
reserved for snapshotting even though we are keeping the last index
commit only.

This reverts commit e95d18e.
dnhatn added a commit that referenced this pull request Nov 24, 2017
The commit looks harmless, unfortunately it can break the engine flush
scheduler and the translog rolling. Both `uncommittedOperations` and
`uncommittedSizeInBytes` are currently calculated based on the minimum
required generation for recovery rather than the translog generation of
the last index commit. This is not correct if other index commits are
reserved for snapshotting even though we are keeping the last index
commit only.

This reverts commit e95d18e.
@dnhatn
Copy link
Member Author

dnhatn commented Nov 24, 2017

@bleskes I have reverted this PR. This looks harmless, unfortunately it can break the engine flush scheduler and the translog rolling. Both uncommittedOperations and uncommittedSizeInBytes are currently calculated based on the minimum required generation for recovery rather than the translog generation of the last index commit. This is not correct if other index commits are reserved for snapshotting even though we are keeping the last index commit only.

I will include this to #27367.

jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Nov 24, 2017
* master:
  Revert "Adjust CombinedDeletionPolicy for multiple commits (elastic#27456)"
  Transpose expected and actual, and remove duplicate info from message. (elastic#27515)
  [DOCS] Fixed broken link in breaking changes
  Backport wait_for_initialiazing_shards to cluster health API
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Nov 24, 2017
* master:
  Revert "Adjust CombinedDeletionPolicy for multiple commits (elastic#27456)"
  Transpose expected and actual, and remove duplicate info from message. (elastic#27515)
  [DOCS] Fixed broken link in breaking changes
  Backport wait_for_initialiazing_shards to cluster health API
@bleskes
Copy link
Contributor

bleskes commented Nov 25, 2017

This is not correct if other index commits are reserved for snapshotting even though we are keeping the last index commit only.

@dnhatn I wonder if we should wrap the CombinedDeletionPolicy with the SanpshotDeletionPolicy, rather than the other way around. Did you look into that?

@dnhatn
Copy link
Member Author

dnhatn commented Nov 26, 2017

@bleskes We can not distinguish between kept commits and snapshotted commits with the SnapshotDeletionPolicy. These commits are just exposed as undeleted commits from SnapshotDeletionPolicy. I am thinking to re-implement our SnapshotDeletionPolicy to support this. What do you think?

The actual issue is a single index commit assumption. This assumption is perfect and totally correct, however it no longer be correct if we update the translog (this PR) and index deletion policy. I am addressing this and will include it to #27367.

@dnhatn
Copy link
Member Author

dnhatn commented Nov 26, 2017

I wonder if we should wrap the CombinedDeletionPolicy with the SnapshotDeletionPolicy, rather than the other way around. Did you look into that?

@bleskes, Yes. We can use a CombinedDeletionPolicy in the SnapshotDeletionPolicy to solve this issue without modifying the SnapshotDeletionPolicy. Thank you for this great idea.

@dnhatn
Copy link
Member Author

dnhatn commented Nov 26, 2017

@bleskes Just to confirm that this works as expected 👍

@bleskes
Copy link
Contributor

bleskes commented Nov 26, 2017

@bleskes Just to confirm that this works as expected

Cool. I presume this goes in another PR? (let's continue the discussion, if need be, in #27367 so it's all in one place).

dnhatn added a commit to dnhatn/elasticsearch that referenced this pull request Nov 26, 2017
Today we can not distinguish between index commits that are kept by the
primary policy and those are kept for snapshotting with a
SnapshotDeletionPolicy. Since we enclose a SnapshotDeletionPolicy in a
CombinedDeletionPolicy, we also we can not distinguish between those
with a CombinedDeletionPolicy. This can be a problem if we update the
TranslogDeletionPolicy to keep the minimum translog generation of
undeleted index commits as we may keep the translog of a snapshotting
commit even though it is "deleted" by the primary policy.

To solve this, we enclose a CombinedDeletionPolicy in a
SnapshotDeletionPolicy and track if an index commit is deleted by the
primary policy, then use that value to maintain translog rather than the
actual deletion of an index commit.

Relates elastic#27456 elastic#27367
jasontedor added a commit that referenced this pull request Nov 27, 2017
* master:
  Skip shard refreshes if shard is `search idle` (#27500)
  Remove workaround in translog rest test (#27530)
  inner_hits: Return an empty _source for nested inner hit when filtering on a field that doesn't exist.
  percolator: Avoid TooManyClauses exception if number of terms / ranges is exactly equal to 1024
  Dedup translog operations by reading in reverse (#27268)
  Ensure logging is configured for CLI commands
  Ensure `doc_stats` are changing even if refresh is disabled (#27505)
  Fix classes that can exit
  Revert "Adjust CombinedDeletionPolicy for multiple commits (#27456)"
  Transpose expected and actual, and remove duplicate info from message. (#27515)
  [DOCS] Fixed broken link in breaking changes
jasontedor added a commit that referenced this pull request Nov 27, 2017
* 6.x:
  [DOCS] s/Spitting/Splitting in split index docs
  inner_hits: Return an empty _source for nested inner hit when filtering on a field that doesn't exist.
  percolator: Avoid TooManyClauses exception if number of terms / ranges is exactly equal to 1024
  Dedup translog operations by reading in reverse (#27268)
  Ensure logging is configured for CLI commands
  Ensure `doc_stats` are changing even if refresh is disabled (#27505)
  Fix classes that can exit
  Revert "Adjust CombinedDeletionPolicy for multiple commits (#27456)"
  Transpose expected and actual, and remove duplicate info from message.
@clintongormley clintongormley added :Distributed Indexing/Distributed A catch all label for anything in the Distributed Area. Please avoid if you can. :Distributed Indexing/Engine Anything around managing Lucene and the Translog in an open shard. and removed :Translog :Distributed Indexing/Distributed A catch all label for anything in the Distributed Area. Please avoid if you can. labels Feb 13, 2018
@jimczi jimczi added v7.0.0-beta1 and removed v7.0.0 labels Feb 7, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed Indexing/Engine Anything around managing Lucene and the Translog in an open shard. >enhancement v6.2.0 v7.0.0-beta1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants